In this Milestone, you will pivot from exploratory analysis to explanatory analysis. By now you have explored the key elements of your data using KPI techniques, and you should have some insights or answers to present. Here we will give structure to those insights and answers so that they form a clear and compelling presentation of your findings. As mentioned previously, you have two options for your final deliverable: a single-frame viz (or infographic), or a multi-frame data story using Story Points. To decide, consult your project proposal and your persona document from Milestone 1 and choose the format that best suits the needs of your audience.
import numpy as np
from numpy import count_nonzero, median, mean
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
from plotly.subplots import make_subplots
import plotly.graph_objects as go
import random
%matplotlib inline
# Set the notebook autosave interval (in seconds)
%autosave 60
# Apply style and font scale in one call; a separate sns.set() would reset the style to its default
sns.set(style='dark', font_scale=1.2)
plt.rc('axes', titlesize=9)
plt.rc('axes', labelsize=14)
plt.rc('xtick', labelsize=12)
plt.rc('ytick', labelsize=12)
import warnings
warnings.filterwarnings('ignore')
pd.set_option('display.max_columns',None)
#pd.set_option('display.max_rows',None)
pd.set_option('display.width', 1000)
pd.set_option('display.float_format','{:.2f}'.format)
random.seed(0)
np.random.seed(0)
np.set_printoptions(suppress=True)
df = pd.read_csv("clean.csv")
df.head()
| | date | quarter | department | day | team | targeted_productivity | smv | wip | over_time | incentive | idle_time | idle_men | no_of_style_change | no_of_workers | actual_productivity | prod_diff |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2015-01-01 | Quarter1 | sweing | Thursday | 8 | 0.80 | 26.16 | 1108.00 | 6960.00 | 50.00 | 0.00 | 0 | 0 | 59.00 | 0.94 | 0.14 |
| 1 | 2015-01-01 | Quarter1 | finishing | Thursday | 1 | 0.75 | 3.94 | 1190.47 | 960.00 | 0.00 | 0.00 | 0 | 0 | 8.00 | 0.89 | 0.14 |
| 2 | 2015-01-01 | Quarter1 | sweing | Thursday | 11 | 0.80 | 11.41 | 968.00 | 3660.00 | 50.00 | 0.00 | 0 | 0 | 30.50 | 0.80 | 0.00 |
| 3 | 2015-01-01 | Quarter1 | sweing | Thursday | 12 | 0.80 | 11.41 | 968.00 | 3660.00 | 50.00 | 0.00 | 0 | 0 | 30.50 | 0.80 | 0.00 |
| 4 | 2015-01-01 | Quarter1 | sweing | Thursday | 6 | 0.80 | 25.90 | 1170.00 | 1920.00 | 50.00 | 0.00 | 0 | 0 | 56.00 | 0.80 | 0.00 |
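The rows in the preview are consistent with `prod_diff` being the gap between actual and targeted productivity (for example, 0.94 - 0.80 = 0.14 in row 0). The notebook does not show how the column was built, so the following is only a sanity-check sketch under that assumption. The `sample` frame is a hypothetical stand-in typed in from the preview rows, not the full `clean.csv`.

```python
import numpy as np
import pandas as pd

# Hypothetical sample copied from the df.head() preview (not the full dataset)
sample = pd.DataFrame({
    "targeted_productivity": [0.80, 0.75, 0.80, 0.80, 0.80],
    "actual_productivity": [0.94, 0.89, 0.80, 0.80, 0.80],
    "prod_diff": [0.14, 0.14, 0.00, 0.00, 0.00],
})

# Assumption: prod_diff = actual_productivity - targeted_productivity
recomputed = sample["actual_productivity"] - sample["targeted_productivity"]

# The preview is rounded to two decimals, so compare with a tolerance
matches = np.isclose(recomputed, sample["prod_diff"], atol=0.005)
print(matches.all())
```

Running the same comparison on the full `df` would show whether the assumption holds beyond the first five rows.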
# Scatter plots: how incentive and number of workers relate to actual productivity
fig1 = px.scatter(data_frame=df, x="incentive", y="actual_productivity",
                  title="Incentive vs Productivity",
                  labels=dict(actual_productivity="Actual Productivity", incentive="Incentive"))
fig2 = px.scatter(data_frame=df, x="no_of_workers", y="actual_productivity",
                  title="Number of Workers vs Productivity",
                  labels=dict(actual_productivity="Actual Productivity", no_of_workers="Number of Workers"))
# Line plot: actual productivity over the dates covered by the data
fig3 = px.line(data_frame=df, x="date", y="actual_productivity",
               title="Actual Productivity in Jan-Mar 2015",
               labels=dict(actual_productivity="Actual Productivity", date="Date"))
fig1.show()
fig2.show()
fig3.show()
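The preview of `df` shows several rows sharing the same date (one per team), so a line drawn through the raw rows connects multiple productivity values on each day. One option, sketched here rather than taken from the original analysis, is to aggregate to one value per date before plotting. The `sample` frame is a hypothetical stand-in built from the five preview rows, and the daily mean is an assumed choice of aggregate (a median would work the same way).

```python
import pandas as pd

# Hypothetical sample from the df.head() preview; in the notebook, use df instead
sample = pd.DataFrame({
    "date": ["2015-01-01"] * 5,
    "actual_productivity": [0.94, 0.89, 0.80, 0.80, 0.80],
})
sample["date"] = pd.to_datetime(sample["date"])

# Collapse to one row per date: the mean actual productivity across teams
daily = (
    sample.groupby("date", as_index=False)["actual_productivity"]
    .mean()
    .sort_values("date")
)
print(daily)
```

The resulting `daily` frame has the same `date` and `actual_productivity` columns, so it can be passed to `px.line` in place of `df`.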
# Box plots: distribution of actual productivity across each categorical variable
fig1 = px.box(data_frame=df, y="actual_productivity", facet_col="quarter", color="quarter",
              title="Actual Productivity by Quarter", labels=dict(actual_productivity="Actual Productivity"))
fig2 = px.box(data_frame=df, y="actual_productivity", facet_col="department", color="department",
              title="Actual Productivity by Department", labels=dict(actual_productivity="Actual Productivity"))
fig3 = px.box(data_frame=df, y="actual_productivity", facet_col="day", color="day",
              title="Actual Productivity by Day", labels=dict(actual_productivity="Actual Productivity"))
# Teams are wrapped four facets per row, with extra height so each panel stays readable
fig4 = px.box(data_frame=df, y="actual_productivity", facet_col="team", color="team",
              facet_col_wrap=4, height=1200,
              title="Actual Productivity by Team", labels=dict(actual_productivity="Actual Productivity"))
fig1.show()
fig2.show()
fig3.show()
fig4.show()